Exploratory data analysis and visualizations

Tasks:

1. Run this Jupyter notebook step by step and try to understand what each step does.
2. Find one or more new data sets (e.g. on Kaggle) and replace the car data set.
3. Repeat the exploratory data analysis and visualizations based on the new data.
4. For spatial data analysis, replace the attribute 'residents_per_km2' with a new attribute.
5. Repeat the spatial data exploration based on the new attribute.

Save the Jupyter notebook with your solutions as an HTML file and upload it to Moodle.

Importing the required libraries

Loading the data into a data frame
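A minimal sketch of this step, assuming the data comes as a CSV file. The inline text and its column names are hypothetical stand-ins for the car data set; with a real file, pass its path to `pd.read_csv` instead.

```python
import io

import pandas as pd

# Hypothetical stand-in for a CSV file on disk; column names are assumptions
csv_text = """Make,Model,Year,Engine HP,MSRP
BMW,1 Series,2011,300,46135
Audi,A4,2015,220,35900
Ford,Focus,2014,160,17225
"""

# pd.read_csv accepts a file path or any file-like object
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (rows, columns)
print(df.head())  # first rows for a quick visual check
```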

Checking the data types
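One way to check the types is the `dtypes` attribute. The sketch below uses a small made-up data frame in which prices were read as text, a common reason to convert a column before analysis.

```python
import pandas as pd

# Made-up example; "price" holds numbers stored as text
df = pd.DataFrame({
    "make": ["BMW", "Audi", "Ford"],
    "year": [2011, 2015, 2014],
    "price": ["46135", "35900", "17225"],
})

print(df.dtypes)  # one data type per column

# Convert the text column to a numeric type so it can be summarized and plotted
df["price"] = pd.to_numeric(df["price"])
print(df.dtypes)
```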

Dropping the irrelevant columns
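A possible sketch with `DataFrame.drop`. Which columns count as irrelevant depends on the question being asked; the two dropped here are chosen only for illustration.

```python
import pandas as pd

# Made-up example; the columns treated as irrelevant are an assumption
df = pd.DataFrame({
    "Make": ["BMW", "Audi"],
    "Market Category": ["Luxury", "Luxury"],
    "Popularity": [3916, 3105],
    "MSRP": [46135, 35900],
})

# drop() returns a new data frame; the original is left unchanged
df_small = df.drop(columns=["Market Category", "Popularity"])
print(df_small.columns.tolist())
```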

Renaming the columns
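Short, lowercase names without spaces are easier to type in later steps. A sketch with `DataFrame.rename`; the old and new names are assumptions.

```python
import pandas as pd

# Made-up example with verbose column names
df = pd.DataFrame({"Engine HP": [300, 220], "MSRP": [46135, 35900]})

# Map old names to new names; columns not listed keep their name
df = df.rename(columns={"Engine HP": "hp", "MSRP": "price"})
print(df.columns.tolist())
```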

Dropping the duplicate rows
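A sketch that first counts fully identical rows and then removes them, using made-up data.

```python
import pandas as pd

# Made-up example; the first two rows are identical
df = pd.DataFrame({
    "make": ["BMW", "BMW", "Audi"],
    "hp": [300, 300, 220],
})

# duplicated() marks every repeat of an earlier row as True
n_dup = df.duplicated().sum()
print("duplicate rows:", n_dup)

# Keep the first occurrence of each row and renumber the index
df_unique = df.drop_duplicates().reset_index(drop=True)
print(df_unique)
```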

Counting and dropping the missing values
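A sketch of counting missing values per column before deciding how to handle them. Dropping every incomplete row, as done here, is the simplest option but can discard a lot of data; the values are made up.

```python
import numpy as np
import pandas as pd

# Made-up example with one missing value in "hp" and one in "price"
df = pd.DataFrame({
    "make": ["BMW", "Audi", "Ford"],
    "hp": [300, np.nan, 160],
    "price": [46135, 35900, np.nan],
})

# Number of missing values in each column
missing_per_column = df.isna().sum()
print(missing_per_column)

# Keep only rows without any missing value
df_complete = df.dropna()
print(len(df), "rows before,", len(df_complete), "rows after")
```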

Showing summary statistics of (cleaned up) variables
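`DataFrame.describe` returns count, mean, standard deviation, minimum, quartiles and maximum for each numeric column. A sketch with made-up values:

```python
import pandas as pd

# Made-up example with two numeric variables
df = pd.DataFrame({
    "hp": [100, 200, 300],
    "price": [10000, 20000, 30000],
})

# One column of statistics per numeric variable
summary = df.describe()
print(summary)
```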

Using boxplots for outlier detection
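A sketch with Matplotlib, assuming the default whisker rule of 1.5 times the interquartile range (IQR). Points beyond the whiskers are drawn individually and can be read back from the plot. The horsepower values are made up, and the notebook itself may use a different plotting library.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up horsepower values with one extreme observation
hp = pd.Series([150, 160, 170, 180, 190, 200, 1000], name="hp")

fig, ax = plt.subplots(figsize=(3, 5))
result = ax.boxplot(hp.to_numpy())
ax.set_ylabel("hp")

# Points drawn beyond the whiskers
outliers = result["fliers"][0].get_ydata()
print("outliers:", outliers)

# The same rule by hand: values outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
q1, q3 = hp.quantile([0.25, 0.75])
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
print("fences:", lower, upper)
```

Whether a flagged point is an error or a genuine extreme value still has to be judged from the data.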

Calculating quantiles to obtain information about the distribution of each variable
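A sketch with `Series.quantile`; the chosen probabilities and the price values are assumptions.

```python
import pandas as pd

# Made-up prices
price = pd.Series([10000, 20000, 30000, 40000, 50000], name="price")

# Quantiles at selected probabilities; 0.5 is the median
q = price.quantile([0.05, 0.25, 0.50, 0.75, 0.95])
print(q)
```

`DataFrame.quantile` accepts the same list and returns one column per numeric variable.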

Plotting a histogram to show the distribution of a variable
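A sketch with Matplotlib's `hist`, which also returns the bin counts and bin edges it used. The values and the number of bins are made up; the notebook itself may use Seaborn or pandas plotting instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up horsepower values
hp = pd.Series([95, 110, 125, 130, 145, 150, 155, 170, 185], name="hp")

fig, ax = plt.subplots()
counts, edges, _ = ax.hist(hp.to_numpy(), bins=4, edgecolor="black")
ax.set_xlabel("hp")
ax.set_ylabel("number of cars")

print("bin edges:", edges)
print("counts per bin:", counts)
```

The impression a histogram gives depends on the number of bins, so it is worth trying a few values.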

Using a density plot to show the distribution of a variable

Plotting a bar chart to show the number of observations per category
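A sketch that counts the observations per category with `value_counts` and draws the counts with Matplotlib. The category values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up categorical variable
df = pd.DataFrame({"make": ["BMW", "Audi", "BMW", "Ford", "BMW", "Audi"]})

# Observations per category, sorted from most to least frequent
counts = df["make"].value_counts()
print(counts)

fig, ax = plt.subplots()
ax.bar(counts.index.tolist(), counts.to_numpy())
ax.set_xlabel("make")
ax.set_ylabel("number of cars")
```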

Using a scatterplot to explore the relationship between two variables
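A sketch with Matplotlib, plus the Pearson correlation as a numeric companion to the plot. The data are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: price tends to rise with horsepower
df = pd.DataFrame({
    "hp": [100, 150, 200, 250, 300],
    "price": [15000, 22000, 31000, 38000, 52000],
})

fig, ax = plt.subplots()
ax.scatter(df["hp"], df["price"])
ax.set_xlabel("hp")
ax.set_ylabel("price")

# Pearson correlation between the two variables
r = df["hp"].corr(df["price"])
print("correlation:", round(r, 3))
```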

Using a scatterplot matrix to explore the relationships between more than two variables
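A sketch with `pandas.plotting.scatter_matrix`, which draws every pair of numeric columns against each other and a histogram of each variable on the diagonal. The data are made up, and the notebook itself may use Seaborn's `pairplot` for the same purpose.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import pandas as pd
from pandas.plotting import scatter_matrix

# Made-up data with three numeric variables
df = pd.DataFrame({
    "hp": [100, 150, 200, 250, 300],
    "price": [15000, 22000, 31000, 38000, 52000],
    "year": [2010, 2012, 2013, 2015, 2016],
})

# Returns a grid of axes, one per pair of variables
axes = scatter_matrix(df, figsize=(6, 6), diagonal="hist")
print(axes.shape)
```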

Using a heat map to show the relationships between more than two variables
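A sketch that computes the correlation matrix with pandas and draws it with Matplotlib's `imshow`. The data are made up, and `imshow` is swapped in here to keep the example small; Seaborn's `heatmap` is a common alternative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; omit this line inside Jupyter
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: mpg falls as hp rises
df = pd.DataFrame({
    "hp": [100, 150, 200, 250, 300],
    "price": [15000, 22000, 31000, 38000, 52000],
    "mpg": [40, 34, 29, 25, 20],
})

# Pairwise Pearson correlations between all numeric columns
corr = df.corr()
print(corr.round(2))

fig, ax = plt.subplots()
im = ax.imshow(corr.to_numpy(), vmin=-1, vmax=1, cmap="coolwarm")
ax.set_xticks(range(len(corr.columns)))
ax.set_xticklabels(list(corr.columns))
ax.set_yticks(range(len(corr.index)))
ax.set_yticklabels(list(corr.index))
fig.colorbar(im, ax=ax, label="Pearson correlation")
```

Fixing the color scale to the range -1 to 1 keeps the colors comparable between data sets.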

Using bubble plots to visualize data

Using a radar chart to visualize data

Exploring spatial data

Importing and exploring a polygon map in GeoJSON format
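A GeoJSON file is plain JSON, so its structure can be explored with the standard library before handing it to a mapping package. The sketch below uses a tiny made-up `FeatureCollection`; the polygon names, coordinates and values are assumptions, and `residents_per_km2` is stored as a property only for illustration.

```python
import json

# Hypothetical GeoJSON content with two square polygons
geojson_text = """
{
  "type": "FeatureCollection",
  "features": [
    {"type": "Feature",
     "properties": {"name": "A", "residents_per_km2": 120},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[8.0, 47.0], [8.1, 47.0], [8.1, 47.1],
                                   [8.0, 47.1], [8.0, 47.0]]]}},
    {"type": "Feature",
     "properties": {"name": "B", "residents_per_km2": 950},
     "geometry": {"type": "Polygon",
                  "coordinates": [[[8.1, 47.0], [8.2, 47.0], [8.2, 47.1],
                                   [8.1, 47.1], [8.1, 47.0]]]}}
  ]
}
"""

# With a real file: polys = json.load(open(path)), where path is your file
polys = json.loads(geojson_text)
print(polys["type"], "with", len(polys["features"]), "features")

for feature in polys["features"]:
    props = feature["properties"]
    ring = feature["geometry"]["coordinates"][0]  # outer ring of the polygon
    print(props["name"], feature["geometry"]["type"], len(ring), "points")
```

Each polygon ring repeats its first point at the end, and coordinates are given as longitude, latitude.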

Plotting the map

Plotting a subset of the map

Importing and exploring attribute data

Scatterplot matrix of attribute data

Using a choropleth map to explore the spatial pattern of a variable

Further readings and tips

Seaborn:

https://jakevdp.github.io/PythonDataScienceHandbook/04.14-visualization-with-seaborn.html

Matplotlib:

https://www.machinelearningplus.com/plots/top-50-matplotlib-visualizations-the-master-plots-python/

Exploratory data analysis:

https://towardsdatascience.com/15-data-exploration-techniques-to-go-from-data-to-insights-93f66e6805df

Spatial data and maps:

https://ipyleaflet.readthedocs.io/en/latest/

https://python-visualization.github.io/folium/quickstart.html

https://deparkes.co.uk/2016/06/10/folium-map-tiles/

https://nbviewer.jupyter.org/gist/talbertc-usgs/18f8901fc98f109f2b71156cf3ac81cd

https://www.nagarajbhat.com/post/folium-visualization

https://ocefpaf.github.io/python4oceanographers/blog/2015/03/23/wms_layers/